feat: configurable vector datatype (int8 quantization) for long-term memory #302

Open
silversurfer562 wants to merge 2 commits into redis:main from silversurfer562:feat/configurable-vector-datatype

@silversurfer562 silversurfer562 commented Jun 4, 2026


What

Adds a REDISVL_DATATYPE setting so the long-term-memory vector
index can use int8 (and the other RedisVL datatypes) instead of
the currently hardcoded float32.

Why

On a Redis 8 Query Engine, int8-quantized vectors cut index memory
by ~75% and speed search ~30% with negligible recall loss. Today the
datatype is hardcoded to float32 in _build_redis_schema and the
write path, so there is no way to opt in. (dims,
distance_metric, and algorithm are already settings-driven —
this brings datatype in line.)
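
The ~75% figure is consistent with simple storage arithmetic: float32 uses 4 bytes per dimension and int8 uses 1. A minimal sketch of that arithmetic follows. The helper names are illustrative (not from the PR), and the calculation covers only the raw vector payload, not index graph overhead or key metadata.

```python
# Standard element widths in bytes for common vector datatypes.
BYTES_PER_ELEMENT = {
    "float64": 8,
    "float32": 4,
    "float16": 2,
    "bfloat16": 2,
    "int8": 1,
    "uint8": 1,
}


def raw_vector_bytes(dims: int, datatype: str) -> int:
    """Raw payload size of one vector, ignoring any index overhead."""
    return dims * BYTES_PER_ELEMENT[datatype]


def payload_saving(dims: int, old: str, new: str) -> float:
    """Fraction of raw vector bytes saved by switching datatypes."""
    return 1 - raw_vector_bytes(dims, new) / raw_vector_bytes(dims, old)


# 768 dimensions matches the nomic-embed-text model used in the Testing section.
print(raw_vector_bytes(768, "float32"))           # 3072
print(raw_vector_bytes(768, "int8"))              # 768
print(payload_saving(768, "float32", "int8"))     # 0.75
```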

How

  • config: new redisvl_datatype setting, default "float32".
  • factory: _build_redis_schema reads settings.redisvl_datatype;
    create_redis_memory_vector_db passes it to the DB.
  • vector db: a datatype constructor arg (default float32).
    Encoding/queries honor it: float types go through RedisVL's
    array_to_buffer; int8 is quantized first via per-vector max-abs
    scaling (RedisVL validates the int8 range but does not quantize),
    which COSINE is invariant to. Query vectors are quantized to match
    and the VectorQuery/RangeQuery dtype is set accordingly.
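
The quantization step described above can be sketched in plain Python. This is a stdlib-only illustration of per-vector max-abs scaling, not the PR's NumPy implementation; the function names are hypothetical. It also shows why cosine similarity is only approximately preserved: the scaling itself does not change the angle, but rounding to integers introduces a small error.

```python
import math


def quantize_int8(embedding):
    """Scale a float vector so its largest magnitude maps to 127, then round."""
    # A zero vector has peak 0; fall back to 1.0 to avoid division by zero.
    peak = max((abs(x) for x in embedding), default=0.0) or 1.0
    scale = 127.0 / peak
    # Clamp to the symmetric range [-127, 127] after rounding.
    return [max(-127, min(127, round(x * scale))) for x in embedding]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


a = [0.1, 0.2, -0.3]
b = [0.3, -0.1, 0.2]
before = cosine(a, b)
after = cosine(quantize_int8(a), quantize_int8(b))
print(quantize_int8([0.5, -1.0, 0.25]))  # [64, -127, 32]
print(abs(before - after) < 0.01)        # True
```

Because stored vectors and query vectors go through the same function, both sides of a comparison carry the same kind of rounding error.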

Compatibility

Default is unchanged (float32); existing deployments are
unaffected. int8 requires the Redis 8 Query Engine (TYPE INT8
vector fields); older servers reject it at index creation.

Testing

  • 6 new tests (quantization math, encoding byte-width, config
    default, schema datatype). Full tests/test_memory_vector_db.py
    passes (33).
  • Validated end-to-end against redis:8 + Ollama
    (nomic-embed-text, 768-dim): int8 index created, correct
    semantic ranking; float32 default path unchanged.
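
An encoding byte-width check of the kind listed above might look like the following sketch. It uses the standard library's struct module as a stand-in for RedisVL's array_to_buffer, so it illustrates the property being tested rather than the PR's actual test code; the encode and decode helpers are hypothetical.

```python
import struct

# struct format characters for the two datatypes exercised here.
_FORMATS = {"float32": "f", "int8": "b"}


def encode(values, datatype):
    """Pack values into little-endian bytes for the given datatype."""
    return struct.pack(f"<{len(values)}{_FORMATS[datatype]}", *values)


def decode(buffer, datatype, count):
    """Unpack `count` values of the given datatype from bytes."""
    return list(struct.unpack(f"<{count}{_FORMATS[datatype]}", buffer))


quantized = [64, -127, 32]          # already in the int8 range
floats = [0.5, -1.0, 0.25]

int8_bytes = encode(quantized, "int8")
float_bytes = encode(floats, "float32")

print(len(int8_bytes))    # 3  (one byte per dimension)
print(len(float_bytes))   # 12 (four bytes per dimension)
print(decode(int8_bytes, "int8", 3) == quantized)  # True
```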

Note

Medium Risk
Changes core long-term memory indexing and search encoding; wrong datatype or re-indexing without migration could break existing vector indexes, though the default path is unchanged.

Overview
Adds redisvl_datatype (default float32) so long-term memory vector indexes are no longer hardcoded to float32. The Redis schema, factory, and RedisVLMemoryVectorDatabase now honor the setting end-to-end: index creation, writes (array_to_buffer + optional int8 per-vector max-abs quantization), and semantic/hybrid/recency queries (quantize query vectors and pass dtype on VectorQuery/RangeQuery/hybrid).

Validation: Pydantic checks values against RedisVL VectorDataType (case-normalized). int8/uint8 require cosine distance at schema build time because quantization breaks other metrics.
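
The two validation rules can be sketched as plain functions. This is an illustration under assumptions: the set of datatype names below is assumed to mirror RedisVL's VectorDataType members (the PR checks against the enum itself inside a Pydantic field_validator), and both function names are hypothetical.

```python
# Assumed to mirror RedisVL's VectorDataType values; verify against the library.
SUPPORTED_DATATYPES = frozenset(
    {"bfloat16", "float16", "float32", "float64", "int8", "uint8"}
)
QUANTIZED_DATATYPES = frozenset({"int8", "uint8"})


def normalize_datatype(value: str) -> str:
    """Lowercase the configured datatype and reject unknown names."""
    normalized = value.lower()
    if normalized not in SUPPORTED_DATATYPES:
        raise ValueError(
            f"Unsupported vector datatype {value!r}; "
            f"expected one of {sorted(SUPPORTED_DATATYPES)}"
        )
    return normalized


def check_metric_compatibility(datatype: str, distance_metric: str) -> None:
    """Quantized datatypes are only allowed with cosine distance."""
    if datatype in QUANTIZED_DATATYPES and distance_metric.lower() != "cosine":
        raise ValueError(
            f"Datatype {datatype!r} requires the cosine distance metric, "
            f"got {distance_metric!r}"
        )


print(normalize_datatype("INT8"))            # int8
check_metric_compatibility("int8", "COSINE")  # passes silently
```

Validating at the config boundary means later code can compare against lowercase names without repeating the normalization.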

Default remains float32; opting into int8 needs Redis 8 Query Engine support. New unit tests cover quantization, encoding, config, and schema rules.

Reviewed by Cursor Bugbot for commit d0955b8. Bugbot is set up for automated code reviews on this repo.

Adds a REDISVL_DATATYPE setting so the long-term-memory vector index
can use int8 (and other RedisVL datatypes) instead of the hardcoded
float32. int8 cuts index memory ~75% and speeds search ~30% with
negligible recall loss (Redis 8 Query Engine required for TYPE INT8).

- config: new redisvl_datatype setting (default "float32")
- factory: _build_redis_schema uses settings.redisvl_datatype, and
  passes it to RedisVLMemoryVectorDatabase
- vector db: encode/query honor the datatype. Float types go through
  RedisVL's array_to_buffer; int8 is quantized first (per-vector
  max-abs scaling — RedisVL validates the int8 range but does not
  quantize). Query vectors are quantized to match and the
  VectorQuery/RangeQuery dtype is set accordingly.

Default behavior is unchanged (float32). Adds 6 tests; full
test_memory_vector_db.py suite passes (33).
Copilot AI review requested due to automatic review settings June 4, 2026 05:18
@jit-ci

jit-ci Bot commented Jun 4, 2026


Hi, I’m Jit, a friendly security platform designed to help developers build secure applications from day zero with an MVS (Minimal viable security) mindset.

In case there are security findings, they will be communicated to you as a comment inside the PR.

Hope you’ll enjoy using Jit.

Questions? Comments? Want to learn more? Get in touch with us.

Copilot AI left a comment

Contributor


Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds support for configuring the RedisVL vector datatype (default float32, optional int8) and introduces int8 quantization/encoding so the stored vectors and query vectors match the configured datatype.

Changes:

  • Add redisvl_datatype setting and thread it into Redis schema construction and DB instantiation.
  • Implement int8 per-vector max-abs quantization and unified vector byte encoding via array_to_buffer.
  • Add tests covering default datatype behavior, quantization, encoding, and schema wiring.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/test_memory_vector_db.py | Adds tests for datatype defaulting, quantization/encoding behavior, and schema datatype propagation. |
| agent_memory_server/memory_vector_db_factory.py | Wires settings.redisvl_datatype into schema creation and DB construction. |
| agent_memory_server/memory_vector_db.py | Adds datatype parameter, quantization, encoding, and passes dtype through to RedisVL queries. |
| agent_memory_server/config.py | Introduces redisvl_datatype setting with default float32. |


redisvl_vector_dimensions: str = "1536"
redisvl_index_prefix: str = "memory_idx"
redisvl_indexing_algorithm: str = "HNSW"
redisvl_datatype: str = "float32"
Comment on lines +418 to +423
"""Quantize a float embedding to int8 range for an int8 index.

RedisVL validates the int8 range but does not quantize; float
datatypes pass through unchanged. Per-vector max-abs scaling is
used, which COSINE distance is invariant to.
"""
Comment thread agent_memory_server/memory_vector_db.py Outdated
Comment on lines +426 to +429
arr = np.asarray(embedding, dtype=np.float32)
peak = float(np.max(np.abs(arr))) or 1.0
scaled = np.clip(np.round(arr * (127.0 / peak)), -127, 127)
return scaled.astype(np.int8).tolist()
Comment thread tests/test_memory_vector_db.py Outdated
Comment on lines +783 to +790
original = settings.redisvl_datatype
try:
    settings.redisvl_datatype = "int8"
    schema = _build_redis_schema()
    vec = next(f for f in schema["fields"] if f.get("type") == "vector")
    assert vec["attrs"]["datatype"] == "int8"
finally:
    settings.redisvl_datatype = original
…ypatch

Addresses Copilot review feedback:
- config: field_validator normalizes redisvl_datatype to lowercase and
  validates against RedisVL's VectorDataType set (rejects e.g. 'float').
- factory: raise a clear ValueError when a quantized datatype (int8/
  uint8) is paired with a non-cosine distance metric, since per-vector
  max-abs scaling changes geometry for L2/IP.
- vector db: _maybe_quantize returns an np.int8 array (no .tolist()
  boxing); array_to_buffer consumes it directly.
- tests: use monkeypatch instead of mutating global settings; add tests
  for the validator (normalize + reject) and the int8/cosine guard.
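
Why per-vector max-abs scaling changes geometry for L2 can be seen with two vectors that differ only in magnitude. The sketch below is a stdlib-only illustration with hypothetical helper names, not code from the PR: a vector and its double quantize to the same int8 values, so their L2 distance collapses to zero while their cosine similarity (already 1.0) is unaffected.

```python
import math


def quantize_int8(embedding):
    """Per-vector max-abs scaling into [-127, 127], as sketched for the PR."""
    peak = max((abs(x) for x in embedding), default=0.0) or 1.0
    scale = 127.0 / peak
    return [max(-127, min(127, round(x * scale))) for x in embedding]


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


a = [0.1, -0.3, 0.4]
b = [2 * x for x in a]      # same direction, twice the magnitude

qa, qb = quantize_int8(a), quantize_int8(b)

print(l2(a, b) > 0.5)   # True: the float vectors are clearly apart
print(qa == qb)         # True: magnitude is discarded by the scaling
print(l2(qa, qb))       # 0.0
```

Inner product is affected for the same reason, since it depends on vector magnitude as well as direction.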
@silversurfer562

Author

Thanks for the review — all four points addressed in d0955b8:

  1. Unvalidated datatype string: added a field_validator on
    redisvl_datatype that validates against RedisVL's
    VectorDataType set and normalizes to lowercase, so the rest of
    the code can assume a canonical value (INT8 is normalized to
    int8, and an invalid value such as float now fails at the
    config boundary).

  2. int8 + non-cosine metric: _build_redis_schema now raises a
    clear ValueError when a quantized datatype (int8/uint8) is paired
    with a non-cosine redisvl_distance_metric, since per-vector
    max-abs scaling changes geometry for L2/IP. (New test covers it.)

  3. .tolist() boxing: _maybe_quantize now returns the
    np.int8 array directly and array_to_buffer consumes it without
    the intermediate list.

  4. Shared global settings in tests: switched to pytest's
    monkeypatch fixture.

Full tests/test_memory_vector_db.py passes (36), ruff clean, and I
re-validated both arms end-to-end against redis:8 + Ollama
(int8 and float32 default) after the changes.
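
The test-isolation change can be illustrated without pytest by using the standard library's unittest.mock.patch.object, which restores the attribute automatically in the same way. This is a swapped-in technique for illustration: the PR uses pytest's monkeypatch fixture, where the equivalent call is monkeypatch.setattr(settings, "redisvl_datatype", "int8"). The settings object and build_schema function below are hypothetical stand-ins for the real config and _build_redis_schema.

```python
from types import SimpleNamespace
from unittest import mock

# Hypothetical stand-in for the application's global settings object.
settings = SimpleNamespace(redisvl_datatype="float32")


def build_schema():
    """Hypothetical stand-in for _build_redis_schema."""
    return {
        "fields": [
            {"name": "text", "type": "text"},
            {
                "name": "vector",
                "type": "vector",
                "attrs": {"datatype": settings.redisvl_datatype},
            },
        ]
    }


def test_schema_uses_configured_datatype():
    # The override is undone when the block exits, even if an assert fails.
    with mock.patch.object(settings, "redisvl_datatype", "int8"):
        schema = build_schema()
        vec = next(f for f in schema["fields"] if f.get("type") == "vector")
        assert vec["attrs"]["datatype"] == "int8"
    assert settings.redisvl_datatype == "float32"


test_schema_uses_configured_datatype()
```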

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit d0955b8.


def _encode_vector(self, embedding: Any) -> bytes:
    """Encode an embedding to bytes for the configured datatype."""
    return array_to_buffer(self._maybe_quantize(embedding), dtype=self._datatype)


uint8 config lacks quantization

Medium Severity

redisvl_datatype can be set to uint8 (validated like other RedisVL types), and _build_redis_schema treats uint8 as quantized, but _maybe_quantize only scales for int8. Indexing and search then pass raw float embeddings through array_to_buffer with dtype=uint8, so stored/query vectors won’t match a proper uint8 index.

Additional Locations (1)
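
One possible way to close the gap described in this finding is to have the quantizer dispatch explicitly on the datatype and fail loudly for a quantized type it does not handle, so raw floats never reach the encoder under a uint8 configuration. The sketch below is illustrative only, with a hypothetical function name; it is not the PR's code or its eventual fix. Rejecting uint8 during config validation, or implementing a real uint8 scheme, would be alternatives.

```python
def maybe_quantize(embedding, datatype):
    """Quantize for int8, refuse uint8, pass float datatypes through."""
    if datatype == "int8":
        # Per-vector max-abs scaling into the symmetric range [-127, 127].
        peak = max((abs(x) for x in embedding), default=0.0) or 1.0
        scale = 127.0 / peak
        return [max(-127, min(127, round(x * scale))) for x in embedding]
    if datatype == "uint8":
        # No uint8 scheme is defined here, so refuse rather than encode
        # unscaled floats against a uint8 index.
        raise NotImplementedError("uint8 quantization is not implemented")
    # Float datatypes need no quantization.
    return list(embedding)


print(maybe_quantize([0.5, -1.0, 0.25], "int8"))     # [64, -127, 32]
print(maybe_quantize([0.5, -1.0, 0.25], "float32"))  # [0.5, -1.0, 0.25]
```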
